Exhibit A 

Structural Domain Analysis of SEQ ID NO: 2



INTERPRO 



InterPro is a database of protein families, domains, and functional sites in
which identifiable features found in known proteins can be applied to unknown
protein sequences. http://www.ebi.ac.uk/interpro/
Pfam

Pfam is a large collection of multiple sequence alignments and hidden Markov models covering many common protein 
domains and families. For each family in Pfam you can: Look at multiple alignments; View protein domain architectures; 
Examine species distribution; Follow links to other databases; View known protein structures. 
http://www.sanger.ac.uk/Software/Pfam/index.shtml 

   Model     Seq-from  Seq-to  HMM-from  HMM-to  Score   E-value  Alignment  Description

!! sugar_tr  23        429     1         487     -119.5  0.0007   glocal     Sugar (and other) transporter



sugar_tr: domain 1 of 1, from 23 to 429: score -119.5, E = 0.0007
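The per-domain summary line above packs the model name, sequence coordinates, score and E-value into a single string. As a minimal sketch (the helper name `parse_domain_line` and the regular expression are illustrative assumptions, not part of the Pfam output), such a line could be parsed as follows:

```python
import re

# Hypothetical pattern for an hmmpfam-style per-domain summary line.
DOMAIN_LINE = re.compile(
    r"(?P<model>\S+): domain (?P<index>\d+) of (?P<total>\d+), "
    r"from (?P<start>\d+) to (?P<end>\d+): "
    r"score (?P<score>-?\d+(?:\.\d+)?), E = (?P<evalue>\S+)"
)


def parse_domain_line(line):
    """Return the fields of one per-domain summary line as a dict."""
    m = DOMAIN_LINE.search(line)
    if m is None:
        raise ValueError("not a per-domain summary line: %r" % line)
    start, end = int(m.group("start")), int(m.group("end"))
    return {
        "model": m.group("model"),
        "start": start,
        "end": end,
        "span": end - start + 1,  # residues covered, both ends inclusive
        "score": float(m.group("score")),
        "evalue": float(m.group("evalue")),
    }


hit = parse_domain_line(
    "sugar_tr: domain 1 of 1, from 23 to 429: score -119.5, E = 0.0007"
)
print(hit["model"], hit["span"], hit["score"], hit["evalue"])
```

The `span` field is a convenience for checking how much of the 436-residue query the domain covers.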

*->valvaalgGgflfGyDtgviggflalidflfrfglltssgalaslvg 

+ a++G++l G + +++ +++i + +++ a 
23 — CQAWTGTLLLGTCLLYCARSSMPICTVSMSQDFGWNKKEA 62 
ystvltgIwsifflGrliGslfaGklgdrfGRkksllialvlfvi

g+v s+ff G + +++G+lgdr+G k +l++++ + + ++
63 GIVLSSFFWGYCLTQVVGGHLGDRIGGEKVILLSAsawgSITAV 106

GallsgaapgytTiGlwafyllivGRvlvGlgvGgasvlvPmYisEiAPk 
H +++ +++ + + R+l+Gl G+ + + ++s+ + 

107 TPLLAHLSS AHLAFMTFSRILMGLLQGVYFPALTSLLSQKVRE 149 
+ R++ s+ + ++G l++ +g l + ++ W + +++
150 SERAFTYSIVGAGSQFGTLLTGAVGSLLLEWYG-WQSIFYFSG 191

vpalllligllflPESPRwLvekgkleeArevLaklrgvedvdqeiqeek
++ l++++ + R+L++ ++l A vLa++r +
192 GLT---LLWVWYVY---RYLLSEKDLILALGVLAQSR------------P 223



aeleatvseekagkaswgelfrgrtpkyrqrllmgymlqafqQltGiNai 
++ + v + w+ lfr + ++.+V+ q+ . + + 

224 VSRHSRVP— WRRLFRKPA- -VWAAWSQLS A- AC SFF I L 258 

fYYsptifksvGvsdsvasilvtiivgwNfvfTfvaLiflvDr......

+ pt+f+ + + +++' +v + + +++ +fl+D+ +++ 

259 LSWLPTFFEETFPDAK— GWIFNWPWLVAIPASLFS-GFLSDHlinqgy 305 

..fGRRplll.lGaagmaicflilgasvivallllnkpkdpsskaagiva

+ ++ R+l+ ++G+ . ++++l lg +++++
VFA 342

ivfillfiafFalgwGpipwvilsElFPtkvRskalalataanwlanfii

+ l l + ++ g+ v P ++ + +++a a+ la+++

343 S ASIGLQTFNHS -GI S VNIQDLAP - SC AGFLFGVANTAGALAGVVG 386



gflfpyitgaiglalggyvflvfagllvlfilfvfffvPETkGrtLEe 

l y+ +++ g + f++ a++ l+ + f+v G +
387 VCLGGYLMETTG— SWTCLFNLVAIISNLGL— CTFLVF-— G 
ProtComp

Significant similarity in Location DB - Location: Plasma membrane

Database sequence: AC=Q9BYT1 Location: Plasma membrane DE BA305P22.1 ...
protein, isoform 1).

Score=21855, Sequence length=430, Alignment length=422

Predicted by Neural Nets - Plasma membrane with score

******** Transmembrane segments are found: .+166:179+. .+275 ... 399:412
******** Potential GPI-anchor in position 414 is found
Integral Prediction of protein location: Plasma membrane with score 7.8

Location weights:     LocDB     PotLocDB   Neural Nets   Integral
Nuclear               0.0       .0         .73           0.73
Plasma membrane       21855.0   .0         .91           7.77
Extracellular         0.0       .0         .72           0.72
Cytoplasmic           0.0       .0         .66           0.66
Mitochondrial         0.0       .0         .69           0.69
Endoplasm. retic.     0.0       .0         .71           0.71
Peroxisomal           0.0       .0         .38           0.38
Lysosomal             0.0       .0         .26           0.26
Golgi                 0.0       .0         .24           0.24



SMART ANALYSIS 



SMART (a Simple Modular Architecture Research Tool) allows the identification and
annotation of genetically mobile domains and the analysis of domain architectures. More
than 500 domain families found in signalling, extracellular and chromatin-associated
proteins are detectable. These domains are extensively annotated with respect to phyletic
distributions, functional class, tertiary structures and functionally important residues. Each
domain found in a non-redundant protein database as well as search parameters and
taxonomic information are stored in a relational database system. User interfaces to this
database allow searches for proteins containing specific combinations of domains in
defined taxa. For all the details, please refer to the publications on SMART:
http://smart.embl-heidelberg.de/



Domains within the query sequence of 436 residues

[Figure: domain diagram of the query sequence, residue scale 1 to 200 shown]

SMART analysis reports a sugar_tr domain
Accession number: PF00083 
Sugar (and other) transporter 



TMHMM 

Prediction of transmembrane helices in proteins 
http://www.cbs.dtu.dk/services/TMHMM/



# sequence Length: 436
# sequence Number of predicted TMHs:
# sequence Exp number of AAs in TMHs: 215.57889
# sequence Exp number, first 60 AAs: 6.90461

[Figure: TMHMM posterior probabilities for sequence]
ProDom

ProDom is a comprehensive set of protein domain families automatically generated from the SWISS-PROT and TrEMBL
sequence databases. Corpet et al., Nucleic Acids Res. 28(1): 267.
http://prodes.toulouse.inra.fr/prodom/2002.1/html/home.php



Warning: original output has been filtered to yield non-redundant similarities
BLASTP 2.2.1 [Apr-13-2001]

Reference: Altschul, Stephen F., Thomas L. Madden, Alejandro A. Schaffer,
Jinghui Zhang, Zheng Zhang, Webb Miller, and David J. Lipman (1997),
"Gapped BLAST and PSI-BLAST: a new generation of protein database search
programs", Nucleic Acids Res. 25:3389-3402.



Query= 



(436 letters) 



Database: ProDom 2002.1 Jan2003 multiple alignments 

1,619,602 sequences; 167,025,341 total letters 

Searching 



.done 



ProDom domains producing High-scoring Segment Pairs:

Position   ProDom domain   Score   E value
13-86      #PD004810       325     4e-31
38-104     #PD003131       82      0.007
38-104     #PD523332       86      0.002
38-107     #PD535883       107     8e-06
45-95      #PD413016       89      0.001
48-108     #PD543895       87      0.002
55-92      #PD063885       89      0.001
55-108     #PD000036       95      2e-04
67-108     #PD000082       109     5e-06
87-132     #PD513011       165     2e-12
125-191    #PD078712       87      0.002
133-192    #PD000916       310     2e-29
207-255    #PD434467       244     1e-21
255-318    #PD413016       113     2e-06
256-325    #PD001152       369     4e-36
279-346    #PD394380       183     1e-14
327-371    #PD286146       91      6e-04
347-427    #PD508204       311     2e-29



> PD001152 (Closest domain: Q9BYT1_HUMAN 250-319)
Number of domains in family:

Commentary (automatic):

TRANSPORTER INORGANIC RENAL SODIUM NA-DEPENDENT

Length = 70
Score = 369 (146 bits), Expect = 4e-36
Identities = 70/70 (100%), Positives = 70/70 (100%)

Query: 256 FILLSWLPTFFEETFPDAKGWIFNVVPWLVAIPASLFSGFLSDHLINQGYRAITVRKLMQ 315
           FILLSWLPTFFEETFPDAKGWIFNVVPWLVAIPASLFSGFLSDHLINQGYRAITVRKLMQ
Sbjct: 250 FILLSWLPTFFEETFPDAKGWIFNVVPWLVAIPASLFSGFLSDHLINQGYRAITVRKLMQ 309

Query: 316 GMGLGLSSVF 325
           GMGLGLSSVF
Sbjct: 310 GMGLGLSSVF 319



> PD004810 (Closest domain: Q8VCL5_MOUSE 11-89)
Number of domains in family:

Commentary (automatic):

COPROTEIN CHROMOSOME NA-DEPENDENT III SYMPORT SODIUM

Length = 79
Score = 325 (129 bits), Expect = 4e-31
Identities = 57/74 (77%), Positives = 61/74 (82%)



Query: 13 AGDTQWSRPECQAWXXXXXXXXXXXYCARSSMPICTVSMSQDFGWNKKEAGIVLSSFFWG 72
          A DT+WSRPECQAW           YCAR +MP+CTV+MSQDFGWNKKEAGIVLSSFFWG
Sbjct: 16 AEDTRWSRPECQAWTGILLLGTCLLYCARVTMPVCTVAMSQDFGWNKKEAGIVLSSFFWG 75

Query: 73 YCLTQVVGGHLGDR 86
          YCLTQVVGGHLGDR
Sbjct: 76 YCLTQVVGGHLGDR 89



> PD508204 (Closest domain: Q9BYT1_HUMAN 341-421)
Number of domains in family:
Commentary (automatic):

NA-DEPENDENT BAC NOVEL SIMILAR THALIANA ARABIDOPSIS

Length = 81
Score = 311 (124 bits), Expect = 2e-29
Identities = 62/81 (76%), Positives = 62/81 (76%)

Query: 347 GLQTFNHSGISVNIQDLAPSCAGFLFXXXXXXXXXXXXXXXXXXXYLMETTGSWTCLFNL 406
           GLQTFNHSGISVNIQDLAPSCAGFLF                   YLMETTGSWTCLFNL
Sbjct: 341 GLQTFNHSGISVNIQDLAPSCAGFLFGVANTAGALAGVVGVCLGGYLMETTGSWTCLFNL 400

Query: 407 VAIISNLGLCTFLVFGQAQRV 427
           VAIISNLGLCTFLVFGQAQRV
Sbjct: 401 VAIISNLGLCTFLVFGQAQRV 421



> PD000916 (Closest domain: Q9BYT1_HUMAN 127-199)
Number of domains in family:

Commentary (automatic):

RESISTANCE MEMBRANE PROBABLE MULTIDRUG FAMILY

Length = 73
Score = 310 (124 bits), Expect = 2e-29
Identities = 60/60 (100%), Positives = 60/60 (100%)

Query: 133 GVYFPALTSLLSQKVRESERAFTYSIVGAGSQFGTLLTGAVGSLLLEWYGWQSIFYFSGG 192
           GVYFPALTSLLSQKVRESERAFTYSIVGAGSQFGTLLTGAVGSLLLEWYGWQSIFYFSGG
Sbjct: 127 GVYFPALTSLLSQKVRESERAFTYSIVGAGSQFGTLLTGAVGSLLLEWYGWQSIFYFSGG 186



> PD434467 (Closest domain: Q9BYT1_HUMAN 200-249)
Number of domains in family: 1
Commentary (automatic):

Length = 50
Score = 244 (98.6 bits), Expect = 1e-21
Identities = 48/49 (97%), Positives = 49/49 (100%)

Query: 207 SEKDLILALGVLAQSRPVSRHSRVPWRRLFRKPAVWAAWSQLSAACSF 255
           SEKDLILALGVLAQSRPVSRH+RVPWRRLFRKPAVWAAWSQLSAACSF
Sbjct: 201 SEKDLILALGVLAQSRPVSRHNRVPWRRLFRKPAVWAAWSQLSAACSF 249



> PD394380 (Closest domain: Q9DA66_MOUSE 1-99)
Number of domains in family: 1
Commentary (automatic):

Length = 99
Score = 183 (75.1 bits), Expect = 1e-14
Identities = 41/74 (55%), Positives = 54/74 (72%), Gaps = 7/74 (9%)

Query: 279 NVVPWLVAIPASLFSGFLSDHLIN------QGYRAITVRKLMQGMGLGLSSVFALCLGHT 332
           +p+++ L s L+ HL+ QGYR ITVRK MQ MGLGLSS+FALCLGHT
Sbjct: 27  NLLPWLCL-LLLHSTLLAAHLLQGDLPQLQGYRVITVRKFMQVMGLGLSSIFALCLGHT 85

Query: 333 SSFCESVVFASASI 346
           +SF ++++FASASI
Sbjct: 86  TSFLKAMIFASASI 99
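The percentages in the Identities, Positives and Gaps lines appear to be truncated rather than rounded, judging from the values printed in this output (54/74 is reported as 72%, although it equals 72.97%). A minimal sketch of that arithmetic, with the function name `blast_percent` chosen here purely for illustration:

```python
def blast_percent(numerator, denominator):
    """Integer percentage, truncated (assumed to mirror the report above)."""
    return (100 * numerator) // denominator


# Counts taken from the PD394380 summary line.
for label, n, d in [("Identities", 41, 74), ("Positives", 54, 74), ("Gaps", 7, 74)]:
    print("%s = %d/%d (%d%%)" % (label, n, d, blast_percent(n, d)))
```

The same helper can be used to sanity-check the other records, for example 57/74 and 61/74 for PD004810.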



> PD513011 (Closest domain: Q9BYT1_HUMAN 81-126)
Number of domains in family: 1
Commentary (automatic):

Length = 46
Score = 165 (68.2 bits), Expect = 2e-12
Identities = 35/46 (76%), Positives = 35/46 (76%)

Query: 87 IGGEKVILLSASAWGSITAVTPXXXXXXXXXXXFMTFSRILMGLLQ 132
          IGGEKVILLSASAWGSITAVTP           FMTFSRILMGLLQ
Sbjct: 81 IGGEKVILLSASAWGSITAVTPLLAHLSSAHLAFMTFSRILMGLLQ 126



> PD413016 (Closest domain: Q8W4P5_ARATH 352-432)
Number of domains in family: 895
Commentary (automatic):

MULTIDRUG PROBABLE EFFLUX PERMEASE

Length = 81
Score = 113 (48.1 bits), Expect = 2e-06
Identities = 24/67 (35%), Positives = 35/67 (51%), Gaps = 4/67 (5%)

Query: 255 FFILLSWLPTFFEETFP---DAKGWIFNVVPWLVAIPASLFSGFLSDHLINQGYRAITVR 311
           FF++LSW+P +F + W F+ VPW + +GF SD LI +G R
Sbjct: 353 FFVILSWMPIYFNSWHVNLKQAAW-FSAVPWSMMAFTGYIAGFWSDLLIRRGTSITLTR 411

Query: 312 KLMQGMG 318
           K+MQ +G
Sbjct: 412 KIMQSIG 418



> PD000082 (Closest domain: Q9SH82_ARATH 142-197)
Number of domains in family:
Commentary (automatic):

RESISTANCE MEMBRANE PROBABLE FAMILY MULTIDRUG

Length = 56
Score = 109 (46.6 bits), Expect = 5e-06
Identities = 19/42 (45%), Positives = 26/42 (61%)

Query: 67  SSFFWGYCLTQVVGGHLGDRIGGEKVILLSASAWGSITAVTP 108
           SSF WGY + V+GG L DR GG++V+ + W T +TP
Sbjct: 142 SSFLWGYIFSSVIGGALVDRYGGKRVLAWGVALWSLATLLTP 183



> PD535883 (Closest domain: Q8YJH9_BRUME 1-144)
Number of domains in family: 1
Commentary (automatic):

Length = 144
Score = 107 (45.8 bits), Expect = 8e-06
Identities = 21/70 (30%), Positives = 42/70 (60%), Gaps = 1/70 (1%)

Query: 38  YCARSSMPICTVSMSQDFGWNKKEAGIVLSSFFWGYCLTQVVGGHLGDRIGGEKVILLSA 97
           Y R ++ + + ++G+N+ + G +L F +GY ++GG L D++G K+ +++
Sbjct: 49  YIDRGAISYASEQIIGEYGFNRADWGSMLGYFGYGYMFGAILGGTLSDKLGARKLWIIAG 108

Query: 98  SAWGSITAVT 107
           +AW SI AV+
Sbjct: 109 TAW-SIVAVS 117



> PD000036 (Closest domain: Q9V905_DROME 63-130)
Number of domains in family:
Commentary (automatic):

SODIUM-DEPENDENT CARRIER SODIUM-PHOSPHATE SODIUM FAMILY

Length = 68
Score = 95 (41.2 bits), Expect = 2e-04
Identities = 16/54 (29%), Positives = 30/54 (54%)

Query: 55 FGWNKKEAGIVLSSFFWGYCLTQVVGGHLGDRIGGEKVILLSASAWGSITAVTP 108
          + W + + ++L ++F+GY +T+GL + +G V S G +TA+TP
Sbjct: 63 YNWTQSDQALLLGAYFYGYMITSLPAGTLAEMLGARNVAGYSCLVAGILTALTP 116



> PD286146 (Closest domain: Q9SH82_ARATH 407-561)
Number of domains in family: 1
Commentary (automatic):

Length = 155
Score = 91 (39.7 bits), Expect = 6e-04
Identities = 18/45 (40%), Positives = 28/45 (62%)

Query: 327 LCLGHTSSFCESVVFASASIGLQTFNHSGISVNIQDLAPSCAGFL 371
           LCL    S   + VF + ++ L +F+ +G  +N+QD+AP  AGFL
Sbjct: 411 LCLNFAKSPSCAAVFMTIALSLSSFSQAGFLLNMQDIAPQYAGFL 455



> PD413016 (Closest domain: Q99TA7_STAAM 17-104)
Number of domains in family: 895
Commentary (automatic):

MULTIDRUG PROBABLE EFFLUX PERMEASE
Length = 88
Score = 89 (38.9 bits), Expect = 0.001
Identities = 19/51 (37%), Positives = 31/51 (60%)

Query: 45 PICTVSMSQDFGWNKKEAGIVLSSFFWGYCLTQVVGGHLGDRIGGEKVILL 95
          P+ T+ M Q+ G + AG+VL +G + ++GG L D++GG K IL+
Sbjct: 26 PLNTIYMKQELGKSLTVAGLVLMINSFGMVIGNLLGGSLFDKLGGYKTILI 76



> PD063885 (Closest domain: Q9V763_DROME 1-161)
Number of domains in family: 2
Commentary (automatic):

COTRANSPORTER
Length = 161
Score = 89 (38.9 bits), Expect = 0.001
Identities = 16/38 (42%), Positives = 23/38 (60%)

Query: 55 FGWNKKEAGIVLSSFFWGYCLTQVVGGHLGDRIGGEKV 92
          F WN+K+ G +L SFFW + Q+ GG L + G + V
Sbjct: 83 FHWNEKQQGALLGSFFWAHWTLQIPGGILATKYGTKLV 120



> PD078712 (Closest domain: Q23063_CAEEL 5-202)
Number of domains in family: 3
Commentary (automatic):

Length = 198
Score = 87 (38.1 bits), Expect = 0.002
Identities = 22/69 (31%), Positives = 34/69 (48%), Gaps = 2/69 (2%)

Query: 125 RILMGLLQGVYFPALTSLLSQKVRESERAFTYSIVGAGSQFGTLLTGAVGSLLL--EWYG 182
           R G Q L+ + ESE +F +SI+ A SQFG L T +G + +G
Sbjct: 118 RFFAGFAQASQLHFTNDLVLRWTPESEASFFFSIMLATSQFGPLFTMILGGEMCSSSFFG 177

Query: 183 WQSIFYFSG 191
           W++ +Y  G
Sbjct: 178 WEATVYILG 186



> PD543895 (Closest domain: Q8ZR98_SALTY 217-325)
Number of domains in family: 8
Commentary (automatic):

TRANSMEMBRANE MEMBRANE ANTIBIOTIC

Length = 109
Score = 87 (38.1 bits), Expect = 0.002
Identities = 22/61 (36%), Positives = 32/61 (52%), Gaps = 6/61 (9%)

Query: 48  TVSMSQDFGWNKKEAGIVLSSFFWGYCLTQVVGGHLGDRIGGEKVILLSASAWGSITAVT 107
           T +QFG+ + A + L+F+ V+GG +GD+IG + VI WGSI V
Sbjct: 251 TFYLMQKFGLSIQNAQLHLFAFLFAVAAGTVIGGPVGDKIGRKYVI------WGSILGVA 304

Query: 108 P 108
           P
Sbjct: 305 P 305



> PD523332 (Closest domain: Q8ZK69_SALTY 1-107)
Number of domains in family: 10
Commentary (automatic):

PERMEASE PROBABLE 2-KETOGLUCONATE INTEGRAL

Length = 107
Score = 86 (37.7 bits), Expect = 0.002
Identities = 18/67 (26%), Positives = 32/67 (46%)

Query: 38 YCARSSMPICTVSMSQDFGWNKKEAGIVLSSFFWGYCLTQVVGGHLGDRIGGEKVILLSA 97
          Y RS++ + +++ D + A IVL+ F GY + + GG R +K+++L
Sbjct: 22 YLDRSNLSVTLPTITHDLNIDGATASIVLTIFLIGYAFSNIFGGVFTQRYDPKKIVILMV 81

Query: 98 SAWGSIT 104
            W   T
Sbjct: 82 LIWSIAT 88



> PD003131 (Closest domain: Q9RPP3_BURCE 19-130)
Number of domains in family:
Commentary (automatic):

PLASMID PROBABLE 4-HYDROXYPHENYL ACETATE MFS PHTHALATE

Length = 112
Score = 82 (36.2 bits), Expect = 0.007
Identities = 20/67 (29%), Positives = 31/67 (45%)

Query: 38  YCARSSMPICTVSMSQDFGWNKKEAGIVLSSFFWGYCLTQVVGGHLGDRIGGEKVILLSA 97
           Y R ++ + + D G + G+ +S FF GY L +V L RIG K +
Sbjct: 45  YLDRVNVSFAQLQLKHDLGLSDAAYGLGVSLFFIGYILLEVPSTLLLRRIGARKTVTRIM 104

Query: 98  SAWGSIT 104
             WG+I+
Sbjct: 105 LLWGAIS 111



Parameters: 

Database: ProDom 2002.1 Jan2003 multiple alignments

Number of letters in database: 167,025,341
Number of sequences in database: 1,619,602

Lambda    K        H
0.325     0.138    0.441

Gapped
Lambda    K        H
0.267     0.0410   0.140
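The bit scores printed next to each raw score in the records above are consistent with the standard Karlin-Altschul conversion, S' = (Lambda * S - ln K) / ln 2, when the gapped Lambda and K from this Parameters block are used. A minimal sketch of that check (the function name `bit_score` is an illustrative choice, and using the gapped parameters is an assumption inferred from the reported values):

```python
import math

# Gapped Karlin-Altschul parameters as printed in the Parameters block.
LAMBDA_GAPPED = 0.267
K_GAPPED = 0.0410


def bit_score(raw_score, lam=LAMBDA_GAPPED, k=K_GAPPED):
    """Convert a raw BLAST score to a bit score: (lam * S - ln K) / ln 2."""
    return (lam * raw_score - math.log(k)) / math.log(2)


# Raw scores taken from the PD394380 and PD078712 records.
for raw in (183, 87):
    print(raw, round(bit_score(raw), 1))
```

For raw scores 183 and 87 this gives values matching the 75.1 and 38.1 bits reported for those records.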



